As shown in Figure 5.15, the library sizes (sequencing depths) of the six samples are similar. The bar chart shows how the library sizes are distributed across the samples and helps reveal any bias that could arise from unequal library sizes.
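The library-size check described above can be sketched in base R. This is only an illustration under stated assumptions: the count matrix `counts`, its gene names, and the sample names `S1` to `S6` are made-up toy values, not the data behind Figure 5.15. In edgeR, the same totals are stored in `y$samples$lib.size` of a DGEList object `y`.

```r
# Toy count matrix (hypothetical numbers): rows are genes, columns are samples.
counts <- matrix(
  c(120, 130, 110, 125, 115, 128,
     30,  28,  35,  31,  29,  33,
    900, 950, 870, 910, 880, 940,
      0,   2,   1,   0,   3,   1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("S", 1:6))
)

# Library size = total number of counts in each sample (column sum).
lib_sizes <- colSums(counts)

# Ratio of the largest to the smallest library size;
# a value close to 1 indicates similar sequencing depths.
size_ratio <- max(lib_sizes) / min(lib_sizes)

# Bar chart of the library sizes, one bar per sample.
barplot(lib_sizes, las = 2,
        ylab = "Library size (total counts)",
        main = "Library sizes")
```

A ratio far from 1 would suggest that some samples were sequenced much more deeply than others, which is the kind of bias the bar chart is meant to expose.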

In the normalization step, we normalized the count data to eliminate composition biases between libraries. We can assess the TMM normalization with an MD (mean-difference) plot, which displays the library size-adjusted log-fold change between two libraries (the difference) against the average log-expression across those libraries (the mean). If the normalization successfully removed the biases between libraries, the points on the MD plot should be centered on the line of zero log-fold change. The “plotMD(y, column=i)” function creates an MD plot by converting the counts (y) to log2-CPM values and then creating an artificial array by averaging all samples other than the sample specified (column=i) in the

FIGURE 5.16  Mean-difference plots.